WHUSUM: Wuhan University at the Update Summarization Task of TAC 2009

Authors

  • Po Hu
  • Dong-Hong Ji
Abstract

This paper describes WHUSUM, the system we developed to participate in the update summarization task of TAC 2009. Given a topic and its corresponding topic statement, this year's task is to write two summaries (one for Document Set A and one for Document Set B) that meet the information need expressed in the topic statement. To generate a topic-oriented summary for Set A, we present a co-training based strategy that selects topic-relevant sentences from two abundant views, and we adopt a graph-based ranking algorithm (GRASSHOPPER) to achieve both information richness and content diversity in the generated summary. Furthermore, to capture the novel information in Set B and remove information that is redundant with the historical Document Set A, we propose two approaches to encourage novelty. The first incorporates the similarity between sentences in the historical set and the current set into the prior ranking of GRASSHOPPER. The second ranks the sentences of Document Set B directly and then adjusts their ranking scores based on a content comparison between the relevant sentence sets in A and B. The official evaluation results show that our system achieves competitive performance in the general topic-oriented summarization task and ranks in the middle among the 52 submitted systems in the update summarization task, which indicates that there is still considerable room to improve the system's novelty detection mechanism.
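The second novelty strategy in the abstract (rank the Set B sentences, then adjust their scores by comparing them with the relevant sentences of Set A) can be illustrated with a minimal sketch. The abstract does not give the adjustment formula, so everything specific here is an assumption made for illustration: the bag-of-words cosine similarity, the multiplicative penalty, the `penalty` weight, and the function names are hypothetical and are not taken from the WHUSUM system.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Lowercased whitespace tokens with counts (a deliberately crude tokenizer)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def adjust_for_novelty(scored_b, sentences_a, penalty=0.5):
    """Rescale each Set B score by (1 - penalty * max similarity to Set A).

    scored_b: list of (sentence, score) pairs from any base ranker.
    sentences_a: sentences judged relevant in the historical set.
    penalty: hypothetical weight in [0, 1]; not a value from the paper.
    """
    vectors_a = [bag_of_words(s) for s in sentences_a]
    adjusted = []
    for sentence, score in scored_b:
        vec = bag_of_words(sentence)
        # Highest overlap with any historical sentence; 0.0 if Set A is empty.
        overlap = max((cosine(vec, va) for va in vectors_a), default=0.0)
        adjusted.append((sentence, score * (1.0 - penalty * overlap)))
    # Highest adjusted score first.
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)


# Toy example: the first Set B sentence repeats Set A, so it is demoted.
set_a = ["the storm hit the coast on monday"]
set_b = [
    ("the storm hit the coast on monday", 0.9),
    ("officials reopened the harbor on friday", 0.8),
]
ranked = adjust_for_novelty(set_b, set_a)
```

In this toy input the repeated sentence starts with the higher base score but falls below the new one after the penalty. A multiplicative penalty is only one plausible choice; a subtractive penalty or a hard similarity threshold would fit the abstract's description equally well.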

Similar resources

WHUSUM Participation at TAC 2011 Guided Summarization Track

In this report, we present details about the participation of WHUSUM in the guided summarization track at TAC 2011. Guided summarization task requires participants to produce short, coherent summaries of news articles with the guidance of predefined categories and aspects for each category. This year, we extended our query-focused update summarization system with aspect related information. In ...

Description of the LIPN Systems at TAC2009

The Text Analysis Conferences (TAC) offer a unique occasion to show innovative approaches to text summarization. As a first incursion into this new research area, LIPN participated in the Update Summarization task of TAC 2008. The LIPN wanted to improve the results obtained during TAC 2008 and to confirm that the changes made to its summarization system really enhanced the quality of the automa...

Sentence Position revisited: A robust light-weight Update Summarization baseline Algorithm

In this paper, we describe a sentence position based summarizer that is built based on a sentence position policy, created from the evaluation testbed of recent summarization tasks at Document Understanding Conferences (DUC). We show that the summarizer thus built is able to outperform most systems participating in task focused summarization evaluations at Text Analysis Conferences (TAC) 2008. ...

The NTNU Summarization System at TAC 2009

In this paper, we present the results obtained by using a probabilistic summarization framework for the TAC 2009 update summarization task, which has the merit of systematically combining the sentence generative probability and the sentence prior probability for sentence ranking. In particular, each sentence of a document to be summarized is treated as a probabilistic generative model for predic...

TAC 2009 Update Summarization of ICL

For the update summarization task of TAC 2009, we submitted two runs using two different methods. The first one is manifold ranking method, which models all sentences as a graph. The topic description is deemed as the only labeled node and assigned with an initial score, then the scores of all the sentences in the documents are learned by spreading the initial score on the graph. The second met...

Publication date: 2009